Deep Learning Generic Features for Cross-Media Retrieval
نویسندگان
چکیده
Cross-media retrieval is an imperative approach to handle the explosive growth of multimodal data on the web. However, how to effectively uncover the correlations between multimodal data has been a barrier to successful retrieval of cross-media data. The traditional approaches learn the connection between multiple modalities by direct utilization of hand-crafted low-level heterogeneous features and the learned correlation are merely constructed in terms of high-level feature representation. To well exploit the intrinsic structures of multimodal data, it is essential to build up an interpretable correlation between multimodal data. In this paper, we propose a deep model to learn the highlevel feature representation shared by multiple modalities for cross-media retrieval. We learn the discriminative high-level feature representation in a data-driven manner before faithfully encoding the multimodal correlations. We use the large-scale multimodal data crawled from Internet to train our deep model and evaluate its effectiveness on cross-media retrieval based on NUS-WIDE dataset. The experimental results show that the proposed model outperforms other state-of-the-arts approaches.
منابع مشابه
Learning a Semantic Space by Deep Network for Cross-media Retrieval
With the growth of multimedia data, the problem of cross-media (or cross-modal) retrieval has attracted considerable interest in the cross-media retrieval community. One of the solutions is to learn a common representation for multimedia data. In this paper, we propose a simple but effective deep learning method to address the cross-media retrieval problem between images and text documents for ...
متن کاملCross-Media Shared Representation by Hierarchical Learning with Multiple Deep Networks
Inspired by the progress of deep neural network (DNN) in single-media retrieval, the researchers have applied the DNN to cross-media retrieval. These methods are mainly two-stage learning: the first stage is to generate the separate representation for each media type, and the existing methods only model the intra-media information but ignore the inter-media correlation with the rich complementa...
متن کاملWord2VisualVec: Cross-Media Retrieval by Visual Feature Prediction
This paper attacks the challenging problem of cross-media retrieval. That is, given an image find the text best describing its content, or the other way around. Different from existing works, which either rely on a joint space, or a text space, we propose to perform cross-media retrieval in a visual space only. We contribute Word2VisualVec, a deep neural network architecture that learns to pred...
متن کاملA Machine Learning based Music Retrieval and Recommendation System
In this paper, we present a music retrieval and recommendation system using machine learning techniques. We propose a query by humming system for music retrieval that uses deep neural networks for note transcription and a note-based retrieval system for retrieving the correct song from the database. We evaluate our query by humming system using the standard MIREX QBSH dataset. We also propose a...
متن کاملA Modified Grasshopper Optimization Algorithm Combined with CNN for Content Based Image Retrieval
Nowadays, with huge progress in digital imaging, new image processing methods are needed to manage digital images stored on disks. Image retrieval has been one of the most challengeable fields in digital image processing which means searching in a big database in order to represent similar images to the query image. Although many efficient researches have been performed for this topic so far, t...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016